Abstract

Tactics can be developed in a number of different ways. Rules can be created based on a theory of operations, as has been done in the development of tactical decision aids for a considerable time. But these tools can behave poorly in unanticipated scenarios and can require significant design effort. In this paper, an existing machine learning approach for training geographically based agents learns tactics for the placement of surveillance assets. These results serve as a potential benchmark to compare against other methods. While there are many approaches possible, there are currently few ways of directly comparing these methods for the military environment. It would be advantageous to have a common set of benchmark scenarios that could evaluate different strategies with respect to each other. This paper presents such a problem, simulated data, and solution. In this domain a limited number of surveillance assets must autonomously coordinate to detect several types of vessels, so that they can be intercepted. The results show that a machine learning approach is able to consistently locate opposing vessels, even in the presence of noise, but more importantly provides a performance baseline and guide for developing future benchmark problems.

1 Introduction

Determining tactics and plans for command and control (C2) is a common problem. Traditionally, these tactics were devised by human experts after careful consideration of available intelligence. However, the increase in available information and the demand for more complex tactics necessitate the use of computational resources in the process. Such approaches typically rely on defined rules and conditions that can automatically determine the appropriate response to a situation or provide guidance for humans. But the process of defining these criteria is time-consuming and can have poor results when applied to unanticipated scenarios. Thus, machine learning (ML), an artificial intelligence approach that builds systems by examining data, is a rapidly growing and effective means of generating robust tactics for a variety of C2 problems.

While there are many benefits of applying ML to C2, there are also many possible ways to do so. Therefore a common question is “What is the best machine learning approach for this problem?” Unfortunately, there is currently no simple means of determining the best approach without implementing and trying out several methods, which can require significant time and monetary investment. Thus, it would be advantageous to have benchmark tasks that could give a general idea of the effectiveness of various machine learning approaches across various classes of domains.

In this paper, such a benchmark task is presented in the form of a surveillance domain in which two aerial surveillance planes must detect drug runners operating in the waters around Central America. This problem provides an interesting test bed for machine learning because the optimal solution is not necessarily known, and, in fact, may change with different circumstances. Additionally, these planes may need to operate with degraded or limited communication, which can be challenging to many machine learning algorithms. However, it is easy to judge the quality of solutions based on how many drug runners are spotted. To solve this problem, the machine learning technique multiagent HyperNEAT is employed to determine the tactics for the two planes, with the assumption that there is no communication between them. The results show that the planes are able to detect significantly more boats than hand-coded heuristics, and they provide the opportunity for comparison with other machine learning approaches.


2 Background

This section reviews relevant benchmarking approaches, C2 for asset allocation (including multiagent approaches), and the NEAT and HyperNEAT methods that form the backbone of the multiagent HyperNEAT approach.

2.1 Benchmarking

Ranking things is an almost fundamental human desire and it can be very important when trying to select among a group of competing approaches for a particular application. However, it has been notoriously difficult to measure the effectiveness of C2 systems [53] and there are currently few effective means to do so. Interestingly, the increase of machine learning and other computational approaches to C2 brings with it a number of benchmarking techniques from those fields. For example, there are a number of benchmark problems in the machine learning community where all approaches try to learn the most information from the same set of data [33], such as handwriting recognition [20] or diagnosing diseases from patient data [35]. Such benchmarks allow the demonstration of improvement upon or against an existing algorithm. There are dangers of over-fitting to the benchmark, so it is important that the benchmark problem shares a relationship with (or even better, actually is) a real-world problem to be solved. It is also true that techniques will do better on some benchmarks than others, but that can actually be helpful in determining which types of problems a technique is best suited to solve. The benchmark domain proposed in this paper is such a real-world problem and should improve the ability to judge the effectiveness of a C2 approach.

2.2 C2 for Asset Allocation

Aside from human-determined strategies, there are a number of algorithmic and machine learning approaches to C2 of assets. One popular technique is the application of game theory [43]. In these cases, the scenario requiring C2 is modeled as an adversarial game and Nash equilibrium states (i.e., states in which no player can improve its own payoff by unilaterally changing its strategy) can be solved for. However, the rules for different games are very strict and thus may not represent real-world scenarios accurately. Other approaches involve defining sets of states and transitions known as finite state automata [36], but when unanticipated states are encountered, there may not be an adequate response defined. Multiagent learning, discussed in the next section, is a promising approach to asset allocation and control that is applied in this paper.
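To make the Nash equilibrium condition mentioned above concrete, the following minimal sketch enumerates the pure-strategy equilibria of a small two-player matrix game. The function name and the payoff tables are hypothetical and purely illustrative; they are not taken from the game-theoretic C2 work cited here, and mixed-strategy equilibria are ignored.

```python
from itertools import product

def pure_nash_equilibria(payoff_a, payoff_b):
    """Return every (row, col) cell where neither player gains by deviating alone.

    payoff_a[r][c] is the row player's payoff and payoff_b[r][c] the column
    player's payoff when the row player picks r and the column player picks c.
    """
    rows, cols = len(payoff_a), len(payoff_a[0])
    equilibria = []
    for r, c in product(range(rows), range(cols)):
        # Row player cannot do better by switching rows while the column stays fixed.
        row_ok = all(payoff_a[r][c] >= payoff_a[alt][c] for alt in range(rows))
        # Column player cannot do better by switching columns while the row stays fixed.
        col_ok = all(payoff_b[r][c] >= payoff_b[r][alt] for alt in range(cols))
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

# Illustrative payoffs only (a prisoner's-dilemma-shaped game).
example_a = [[-1, -3], [0, -2]]
example_b = [[-1, 0], [-3, -2]]
print(pure_nash_equilibria(example_a, example_b))
```

The strictness noted in the text shows up here as well: the check only works once every action and payoff has been written down in advance.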

2.3 Cooperative Multiagent Learning

When multiple assets must be commanded, it can be beneficial to model the problem as a multiagent system. That is, assets and other important elements of the domain are modeled as interacting agents, which allows for complex simulations and experiments that can model various outcomes. Multiagent systems confront a broad range of domains, creating the opportunity for real-world applications such as room clearing, pursuit [15], and synchronized motion [45]. In cooperative multiagent learning, which is reviewed in this section, agents are trained to work together to accomplish a task, usually by one of several alternative methods. Teams can sometimes share a homogeneous control scheme, which means that all agents have the same control policy and thus only one policy is learned.

There are two primary traditional approaches to multiagent learning. The first, multiagent reinforcement learning (MARL), encompasses several specific techniques based on off-policy and on-policy temporal difference learning [6, 31, 47]. The basic principle that unifies MARL techniques is to identify and reward promising cooperative states and actions among a team of agents [7, 40]. The other major approach, cooperative coevolutionary algorithms (CCEAs), is an established evolutionary method for training teams of agents that must work together [18, 40, 41]. The main idea is to maintain one or more populations of candidate agents, evaluate them in groups, and guide the creation of new candidate solutions based on their joint performance.

While reinforcement learning and evolution are mainly the focus of separate communities, Panait, Tuyls, and Luke [42] showed recently that they share a significant common theoretical foundation. One key commonality is that they break the learning problem into separate roles that are semi-independent and thereby learned separately through interaction with each other. Although this idea of separating multiagent problems into parts is appealing, one problem is that when individual roles are learned separately, there is no representation of how roles relate to the team structure and therefore no principle for exploiting regularities that might be shared across all or part of the team. Thus in cases where learning has been applied to real-world applications, it usually exploits inherent homogeneity in the task [13, 44]. The approach in this paper, multiagent HyperNEAT, addresses many of these issues.

2.4 Evolution and Indirect Encodings

In the context of reinforcement learning problems (such as in multiagent learning), an interesting property of evolutionary computation (EC) is that it is guided by a fitness function rather than by an error computation derived from a reward prediction. The independence of the fitness function from direct error computation has encouraged much experimentation with alternative representations because representations in EC do not need to support an algorithm for optimizing error. Such freedom has led to the advent of innovative representations for neural networks and also to novel methods for encoding complex structures, as described in this section.

The specific subfield of EC that is implemented in this paper is called neuroevolution (NE), which employs EC to create artificial neural networks (ANNs) [19, 67]. In this approach, the phenotype is an ANN and the genotype is an implementation-dependent representation of the ANN. Assuming that the representation is sufficiently robust, NE can evolve any type of ANN, including recurrent and adaptive networks [46, 52]. Early attempts at NE used fixed-topology models that were designed by the experimenter [39]. In the fixed-topology approach, the genotype is simply an array of numbers that represent the weights of each connection in the network. However, this approach is restrictive because the solution may be difficult to discover or may not exist at all in the chosen topology. Thus new techniques that allowed evolving both connection weights and network topology were developed [30, 32, 55]. One such method, NeuroEvolution of Augmenting Topologies or NEAT, which is described next, has proven successful and serves as the foundation for the multiagent learning approach introduced in this paper.

2.4.1 NeuroEvolution of Augmenting Topologies (NEAT)

The NEAT method was originally developed to evolve ANNs to solve difficult control and sequential decision tasks [55, 57, 59]; the basic principles of NEAT are reviewed in this section.

Traditionally, ANNs evolved by NEAT control agents that select actions based on their sensory inputs. NEAT is unlike many previous methods that evolved neural networks, that is, neuroevolution methods, which historically evolved either fixed-topology networks [25, 48] or arbitrary random-topology networks [3, 26, 67]. Instead, NEAT begins evolution with a population of small, simple networks and increases the complexity of the network topology into diverse species over generations, leading to increasingly sophisticated behavior. A similar process of gradually adding new genes has been confirmed in natural evolution [37, 65] and shown to improve adaptation in a few prior evolutionary [2] and neuroevolutionary [28] approaches. However, a key feature that distinguishes NEAT from prior work in evolving increasingly complex structures is its unique approach to maintaining a healthy diversity of structures of different complexity simultaneously, as this section reviews. This approach has proven effective in a wide variety of domains [1, 58, 60, 62]. Complete descriptions of the NEAT method, including experiments confirming the contributions of its components, are available in Stanley and Miikkulainen [55, 57] and Stanley et al. [59].

The NEAT method is based on three key ideas. First, to allow network structures to increase in complexity over generations, a method is needed to keep track of which gene is which. Otherwise, it is not clear in later generations which individual is compatible with which in a population of diverse structures, or how their genes should be combined to produce offspring. NEAT solves this problem by assigning a unique historical marking to every new piece of network structure that appears through a structural mutation. The historical marking is a number assigned to each gene corresponding to its order of appearance over the course of evolution. The numbers are inherited during crossover unchanged, and allow NEAT to perform crossover among diverse topologies without the need for expensive topological analysis.
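The role of historical markings in crossover can be sketched as follows. This is a simplified illustration rather than the reference NEAT implementation: the `ConnectionGene` type and `crossover` function are hypothetical names, node genes and enabled/disabled flags are omitted, and one parent is assumed to be strictly fitter. Genes are aligned purely by their innovation number, so no graph matching is required.

```python
import random
from dataclasses import dataclass

@dataclass(frozen=True)
class ConnectionGene:
    innovation: int  # historical marking, assigned when the gene first appeared
    src: int         # source node id
    dst: int         # destination node id
    weight: float

def crossover(fitter, weaker, rng=random):
    """Combine two genomes (lists of ConnectionGene) by aligning innovation numbers.

    Genes present in both parents are inherited from either parent at random;
    genes present only in the fitter parent are inherited from it unchanged.
    """
    weaker_by_innovation = {gene.innovation: gene for gene in weaker}
    child = []
    for gene in fitter:
        match = weaker_by_innovation.get(gene.innovation)
        if match is not None and rng.random() < 0.5:
            child.append(match)  # matching gene taken from the weaker parent
        else:
            child.append(gene)   # matching or unmatched gene taken from the fitter parent
    return child

parent_a = [ConnectionGene(1, 0, 2, 0.5), ConnectionGene(2, 1, 2, -0.3), ConnectionGene(4, 0, 3, 0.9)]
parent_b = [ConnectionGene(1, 0, 2, -0.7), ConnectionGene(3, 1, 3, 0.2)]
child = crossover(parent_a, parent_b, random.Random(0))
print([gene.innovation for gene in child])
```

The lookup by innovation number is the whole alignment step, which is why differing topologies can be recombined without comparing network structure.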

Second, NEAT speciates the population so that individuals compete primarily within their own niches instead of with the population at large. Because adding new structure is often initially disadvantageous, this separation means that unique topological innovations are protected and therefore have the opportunity to optimize their structure without direct competition from other niches in the population. NEAT uses the historical markings on genes to determine to which species different individuals belong.
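One way to picture how historical markings support speciation is the compatibility distance commonly given for NEAT, which combines the number of excess genes E, the number of disjoint genes D, and the mean weight difference of matching genes: δ = c1·E/N + c2·D/N + c3·W̄, where N is the size of the larger genome. The sketch below is a minimal illustration under stated assumptions: genomes are reduced to a mapping from innovation number to weight, both genomes are non-empty, and the coefficient defaults are illustrative values, not settings reported in this paper.

```python
def compatibility_distance(genes_a, genes_b, c1=1.0, c2=1.0, c3=0.4):
    """Distance between two genomes given as {innovation_number: weight} dicts.

    Assumes both genomes contain at least one gene. Coefficients are illustrative.
    """
    innovations_a, innovations_b = set(genes_a), set(genes_b)
    matching = innovations_a & innovations_b
    non_matching = innovations_a ^ innovations_b
    # Genes newer than the other genome's newest gene are "excess"; the rest are "disjoint".
    cutoff = min(max(innovations_a), max(innovations_b))
    excess = sum(1 for innovation in non_matching if innovation > cutoff)
    disjoint = len(non_matching) - excess
    n = max(len(genes_a), len(genes_b))
    if matching:
        mean_weight_diff = sum(abs(genes_a[i] - genes_b[i]) for i in matching) / len(matching)
    else:
        mean_weight_diff = 0.0
    return c1 * excess / n + c2 * disjoint / n + c3 * mean_weight_diff

genome_a = {1: 0.5, 2: -0.3, 4: 0.9}
genome_b = {1: -0.7, 3: 0.2}
print(compatibility_distance(genome_a, genome_b))
```

In a full implementation, an individual would typically join the first species whose representative lies within some distance threshold; that threshold is another tunable assumption not specified here.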

Third, many approaches that evolve network topologies and weights begin evolution with a population of random topologies [26, 67]. In contrast, NEAT begins with a uniform population of simple networks with no hidden nodes, differing only in their initial random weights. Because of speciation, novel topologies gradually accumulate over evolution, thereby allowing diverse and complex phenotype topologies to be represented. No limit is placed on the size to which topologies can grow. New nodes and connections are introduced incrementally as structural mutations occur, and only those structures survive that are found to be useful through fitness evaluations. In effect, then, NEAT searches for a compact, appropriate topology by incrementally adding complexity to existing structure.

2.4.2 CPPNs and HyperNEAT

A key similarity among many neuroevolution methods, including NEAT, is that they employ a direct encoding, that is, each part of the solution’s representation maps to a single piece of structure in the final solution. For example, in NEAT, the genome is a list of connections and nodes in the neural network in which each item corresponds to exactly one component in the phenotype. Yet direct encodings impose the significant disadvantage that even when different parts of the solution are similar, they must be encoded and therefore discovered separately. This challenge is related to the problem of rediscovery in multiagent systems: after all, if individual team members are encoded by separate genes, even if a component of their capabilities is shared, the search algorithm has no way to exploit such a regularity. Thus this paper leverages the power of indirect encoding instead, which means that the description of the solution is compressed such that information can be reused, allowing the final solution to contain more components than the description itself.

For example, if a hypothetical solution ANN required all weights to be set to 1.0, NEAT would separately have to discover that each such weight must be 1.0, whereas an indirect encoding could instead discover that all weights should be the same value. Indirect encodings are often motivated by development in biology, in which the genotype (DNA) maps to the phenotype (the living organism) indirectly through a process of growth [4, 34, 56]. Indirect encodings are powerful because they allow solutions to be represented as a pattern of policy parameters, rather than requiring each parameter to be represented individually. This capability is the focus of the field called generative and developmental systems [4, 5, 17, 29, 34, 38, 51, 54, 56].

HyperNEAT, reviewed in this section, is an extension of NEAT that allows it to benefit from indirect encoding. HyperNEAT has become a popular neuroevolution method in recent years and has proven effective in a wide range of domains such as board games [21-24], adaptive maze navigation [46], quadruped locomotion [11], keepaway soccer [63, 64] and a variety of others [8-10, 12, 16, 27, 61, 66]. For a full description of HyperNEAT see Stanley et al. [61] and Gauci and Stanley [24].

In HyperNEAT, NEAT is altered to evolve an indirect encoding called compositional pattern producing networks (CPPNs [54]) instead of ANNs. CPPNs are a high-level abstraction of the development process in nature, intended to approximate its representational power without the computational cost. The idea is that regular patterns such as those seen in nature can be approximated at a high level by compositions of functions, wherein each function in the composition loosely corresponds to a canonical event in development. For example, a Gaussian function is analogous to a symmetric chemical gradient. Each such component function also creates a novel geometric coordinate frame within which other functions can reside. For example, any function of the output of a Gaussian will output a symmetric pattern because the Gaussian is symmetric. In this way, the Gaussian is a coordinate frame like a chemical gradient in natural development that provides a context for growing symmetric structures.

The appeal of this encoding is that it allows a representation akin to developmental processes to be encoded as networks of simple functions (that is, CPPNs), which means that NEAT can evolve CPPNs just like ANNs. CPPNs are similar to ANNs, but they rely on more than one activation function (each representing a chemical gradient common to development) and are an abstraction of development rather than of brains. Also, unlike other artificial developmental encodings, CPPNs do not require an explicit simulation of growth or local interaction, yet still exhibit their essential representational capabilities [54].

Specifically, CPPNs produce a phenotype that is a function of n dimensions, where n is the number of dimensions of the desired solution, for example, n = 2 for a two-dimensional image. For each coordinate in that space, its level of expression is output by the CPPN, which encodes the phenotype. Figure 1 shows how a two-dimensional phenotype can be generated by a function of two parameters (x and y) that is represented by a network of composed functions that produce intensity values for each set of parameters. The CPPN in Figure 1b actually produces the pattern in Figure 1a. Because CPPNs are a superset of traditional ANNs, which can approximate any function [14], CPPNs are also universal function approximators. Thus a CPPN can encode any pattern within its n-dimensional space.

Figure 1: CPPN Encoding. Panel (a): Pattern Encoding.

The appeal of the CPPN as an indirect encoding is that it can compactly encode patterns with regularities such as symmetry, repetition, and repetition with variation [49, 50, 54]. For example, simply by including a Gaussian function, which is symmetric, the output pattern can become symmetric. A periodic function such as sine creates segmentation through repetition. Most importantly, repetition with variation (for example, the fingers of the human hand) is easily discovered by combining regular coordinate frames (for example, sine and Gaussian) with irregular ones (for example, the asymmetric x-axis). For example, a function that takes as input the sum of a symmetric function and an asymmetric function outputs a pattern with imperfect symmetry. In this way, CPPNs produce regular patterns with subtle variations reminiscent of many seen in nature. The potential for CPPNs to represent patterns with natural motifs has been demonstrated in several studies [54] including an online service on which users collaboratively breed patterns represented by CPPNs [49, 50].
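The regularities described above can be reproduced with a tiny hand-written composition of functions. The sketch below is not an evolved CPPN; the function names, the frequency 4.0, and the 0.2 scaling of the asymmetric term are arbitrary assumptions chosen only to show symmetry (Gaussian of x), repetition (sine of y), and imperfect symmetry (adding the raw x coordinate).

```python
import math

def gaussian(v):
    """Symmetric bump centred on zero."""
    return math.exp(-v * v)

def symmetric_pattern(x, y):
    # Gaussian of x yields left-right symmetry; sine of y yields repetition along y.
    return math.tanh(gaussian(x) * math.sin(4.0 * y))

def varied_pattern(x, y):
    # Adding the asymmetric x coordinate to the symmetric frame breaks perfect symmetry.
    return math.tanh(gaussian(x) * math.sin(4.0 * y) + 0.2 * x)

def render(pattern, size=5):
    """Sample the pattern on a size-by-size grid spanning [-1, 1] in x and y."""
    coords = [-1.0 + 2.0 * i / (size - 1) for i in range(size)]
    return [[pattern(x, y) for x in coords] for y in coords]

for row in render(varied_pattern):
    print(" ".join(f"{value:+.2f}" for value in row))
```

Each row of `render(symmetric_pattern)` reads the same forwards and backwards, while rows of `render(varied_pattern)` are close to mirrored but not exactly so.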

The main idea in HyperNEAT is that CPPNs can also naturally encode connectivity patterns [21, 22, 24, 61, 64]. That way, NEAT can evolve CPPNs that represent large-scale ANNs with their own symmetries and regularities. This capability will prove essential to encoding multiagent policy geometries in this paper because it will ultimately allow connectivity patterns to be expressed as a function of team geometry, which means that a smooth gradient of policies can be produced across possible agent locations. The key insight in HyperNEAT is that 2n-dimensional spatial patterns are isomorphic to connectivity patterns in n dimensions, that is, in which the coordinate of each endpoint is specified by n parameters, which means that CPPNs can express both spatial and connectivity patterns with the same kinds of regularities.

Figure 2: Hypercube-based Geometric Connectivity Pattern Interpretation. (1) Query each potential connection on the substrate; (2) feed each coordinate pair into the CPPN; (3) the output is the weight between (x1, y1) and (x2, y2).

Figure 3: Substrate Configuration.

Consider a CPPN that takes four inputs labeled x1, y1, x2, and y2; this point in four-dimensional space also denotes the connection between the two-dimensional points (x1, y1) and (x2, y2), and the output of the CPPN for that input thereby represents the weight of that connection (Figure 2). By querying every possible connection among a set of points in this manner, a CPPN can produce an ANN, wherein each queried point is a neuron position. Because the connections are produced by a function of their endpoints, the final structure is a product of the geometry of these points and the CPPN can thus exploit the relationships between them in the network it encodes. In effect, the CPPN is painting a pattern on the inside of a four-dimensional hypercube that is interpreted as the isomorphic connectivity pattern, which explains the origin of the name hypercube-based NEAT (HyperNEAT). Connectivity patterns produced by a CPPN in this way are called substrates so that they can be verbally distinguished from the CPPN itself, which has its own internal topology.
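The querying procedure can be sketched in a few lines. In this illustration `example_cppn` is a hypothetical, hand-written stand-in for an evolved CPPN, and the node coordinates are assumptions; a real HyperNEAT implementation may also scale or threshold the output before using it as a weight, which is omitted here.

```python
import math
from itertools import product

def example_cppn(x1, y1, x2, y2):
    """Hypothetical stand-in for an evolved CPPN: maps two endpoints to a weight."""
    return math.exp(-((x1 - x2) ** 2)) * math.sin(y2 - y1)

def build_connections(cppn, sources, targets):
    """Query the CPPN once per (source, target) pair of substrate coordinates."""
    return {(src, dst): cppn(*src, *dst) for src, dst in product(sources, targets)}

# Assumed layout: three input nodes along y = -1 and three output nodes along y = 1.
inputs = [(-1.0, -1.0), (0.0, -1.0), (1.0, -1.0)]
outputs = [(-1.0, 1.0), (0.0, 1.0), (1.0, 1.0)]
weights = build_connections(example_cppn, inputs, outputs)
for (src, dst), weight in weights.items():
    print(src, "->", dst, f"{weight:+.3f}")
```

Because this stand-in depends on x1 and x2 only through their difference, connections that are mirror images of each other receive identical weights, which is the kind of geometric regularity the text describes.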

Each queried point in the substrate is a node in a neural network. The experimenter defines both the location and role (that is, hidden, input, or output) of each such node. As a rule of thumb, nodes are placed on the substrate to reflect the geometry of the task [10, 11, 22, 24, 61, 64]. That way, the connectivity of the substrate is a function of the task structure.

For example, the sensors of an autonomous robot can be placed from left to right on the substrate in the same order that they exist on the robot (Figure 3). Outputs for moving left or right can also be placed in the same order, implying a relationship between the sensors and effectors. Such placement allows the CPPN to generate connectivity patterns easily that respect the geometry of the problem, such as left-right symmetry. In this way, knowledge about the problem geometry can be injected into the search and HyperNEAT can exploit the regularities (for example, adjacency, or symmetry) of a problem that are invisible to traditional encodings.

In summary, HyperNEAT is a method for evolving ANNs with regular connectivity patterns that uses CPPNs as an indirect encoding. This capability is important for multiagent learning because it provides a formalism for producing policies (that is, the output of the CPPN) as a function of geometry (that is, the inputs to the CPPN). The evolutionary algorithm in HyperNEAT is the same as NEAT except that it evolves CPPNs that encode ANNs instead of evolving the ANNs directly.

3 Approach

The approach applied in this paper combines two existing extensions to the HyperNEAT method: HyperNEAT-BEV and multiagent HyperNEAT, both of which are described in this section.

3.1 HyperNEAT-BEV

The Bird’s Eye View (BEV) extension to HyperNEAT allows it to accept complex information as input represented as a static overhead view, which is a common approach for visualizing and understanding C2 domains for humans. This extension was applied by Verbancsics and Stanley [64] in several strategic decision domains such as RoboCup soccer. This approach is advantageous because it allows HyperNEAT to exploit patterns in the information as a whole, rather than piecewise. Inputting information in this manner is also beneficial because the state-space remains fixed rather than changing egocentrically.

In previous applications of BEV, while there were multiple agents present, only one would act per time step. In this paper, multiple agents must act simultaneously, and thus multiple outputs are required. Therefore the multiagent HyperNEAT paradigm is employed to incentivize cooperation and coordination among their behaviors.


Figure 4: Multiagent HyperNEAT Encoding. (a) Homogeneous CPPN; (b) Heterogeneous CPPN; (c) Homogeneous Substrate; (d) Heterogeneous Substrate.

3.2 Multiagent HyperNEAT

Multiagent HyperNEAT is based on the idea that a team of cooperating agents can be defined by describing the relationship of policies to each other (referred to as the team’s policy geometry). To understand how the policy geometry of a team can be encoded, it helps to begin by considering homogeneous teams, which in effect express a trivial policy geometry in which the same policy is uniformly distributed throughout the team at all positions. Thus this section begins by exploring how teams of purely homogeneous agents can be evolved with an indirect encoding, and then transitions to the method for evolving heterogeneous teams that are represented by a single genome in HyperNEAT.

3.2.1 Pure Homogeneous Teams

A homogeneous team only requires a single controller that is copied once for each agent on the team. To generate such a controller, a four-dimensional CPPN with inputs x1, y1, x2, and y2 (Figure 4a) queries the substrate shown in Figure 4c, which has five inputs, five hidden nodes, and three output nodes, to determine its connection weights. This substrate is designed to correlate sensors to corresponding outputs geometrically (for example, seeing something on the left and turning left). Thus the CPPN can exploit the geometry of the agent [61] when generating the ANN controller. However, the agents themselves have exactly the same policy no matter where they are positioned. Thus while each agent is informed by geometry, their policies cannot differentiate genetically.

3.2.2 Teams on the Continuum of Heterogeneity

Heterogeneous teams are a greater challenge; how can a single CPPN encode a set of networks in a pattern, all with related yet varying roles? Indirect encodings such as HyperNEAT are naturally suited to capturing such patterns by encoding the policy geometry of the team as a pattern. The remainder of this section discusses the method by which HyperNEAT can encode such teams.

The main idea is that the CPPN is able to create a pattern
based on both the agent’s internal geometry (x and y) and its
position on the team (z) by incorporating an additional input
(Figure 4b, d). The CPPN can thus emphasize connections from z
for increasing heterogeneity or minimize them to produce
greater homogeneity. Furthermore, because z is a spatial
dimension, the CPPN can literally generate policies based on
their positions on the team. Note that because z is a single
dimension, the policy geometry of this team (and those in this
paper) is on a one-dimensional line. However, in principle,
more inputs could be added, allowing policy geometries of two
or more dimensions to be learned as well.
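
The effect of the extra input can be illustrated with a toy example. The sketch below is hedged: `toy_team_cppn` is hand-written rather than evolved, and `z_gain` is a hypothetical knob standing in for how strongly an evolved CPPN relies on its connections from z.

```python
import math


def toy_team_cppn(x1, y1, x2, y2, z, z_gain):
    """Hypothetical CPPN with an extra team-position input z."""
    shared = math.exp(-4.0 * (x1 - x2) ** 2) * math.sin(y2 - y1)  # same for all agents
    role = math.sin(z_gain * z * (x1 + x2))                       # varies along the team
    return shared + role


def agent_weights(z, z_gain):
    """Weights of one agent on a tiny 3-by-3 substrate (illustrative sizes)."""
    xs = (-1.0, 0.0, 1.0)
    return [toy_team_cppn(x1, -1.0, x2, 1.0, z, z_gain) for x1 in xs for x2 in xs]


def policy_spread(z_positions, z_gain):
    """Largest weight difference between the first agent and any other agent."""
    base = agent_weights(z_positions[0], z_gain)
    return max(
        abs(w - b)
        for z in z_positions[1:]
        for w, b in zip(agent_weights(z, z_gain), base)
    )


team_z = [-1.0, 0.0, 1.0]
print(policy_spread(team_z, z_gain=0.0))  # z ignored: agents are identical
print(policy_spread(team_z, z_gain=1.0))  # z used: policies differ by position
```

With `z_gain` at zero the team collapses to the homogeneous case, and larger values move it along the continuum toward heterogeneity.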

The heterogeneous substrate (Figure 4d) formalizes the idea of
encoding a team as a pattern of policies. This capability is
powerful because generating each agent with the same CPPN means
they can share tactics and policies while still exhibiting
variation across the policy geometry. In other words, policies
are spread across the substrate in a pattern just as role
assignment in a human team forms a pattern across a field.
However, even as roles vary, many skills are shared, an idea
elegantly captured by indirect encoding.

Importantly, the complexity of the CPPN is independent of the
number of agents in the substrate, which is a benefit of
indirect encoding. Therefore, in principle, teams with a large
number of agents can be trained without the additional cost
that traditional methods would incur. Another key property of
the heterogeneous substrate is that if a new network is added
to the substrate at an intermediate location, its policy can
theoretically be interpolated from the policy geometry embodied
in the CPPN. Thus, as the next section describes, it becomes
possible to scale teams without further training by
interpolating new roles.
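
A small sketch may make the interpolation property concrete. It assumes agents are placed at evenly spaced z coordinates in [-1, 1] and uses a hand-written, smooth `toy_team_cppn` in place of an evolved one; both choices are illustrative assumptions.

```python
import math


def toy_team_cppn(x1, y1, x2, y2, z):
    """Hypothetical CPPN that is smooth in z, so roles change gradually."""
    return math.tanh(0.8 * z + 0.5 * (x2 - x1)) * math.exp(-((y2 - y1) ** 2) / 8.0)


def team_positions(n_agents):
    """Evenly spaced z coordinates in [-1, 1] (an assumed convention)."""
    if n_agents == 1:
        return [0.0]
    return [-1.0 + 2.0 * i / (n_agents - 1) for i in range(n_agents)]


def agent_weights(z):
    """Weights of one agent on a tiny 3-by-3 substrate (illustrative sizes)."""
    xs = (-1.0, 0.0, 1.0)
    return [toy_team_cppn(x1, -1.0, x2, 1.0, z) for x1 in xs for x2 in xs]


def build_team(n_agents):
    """Query the same CPPN at each team position; no retraining involved."""
    return {z: agent_weights(z) for z in team_positions(n_agents)}


small_team = build_team(2)  # agents at z = -1 and z = 1
large_team = build_team(3)  # same CPPN, plus a new agent at z = 0
```

The agent added at z = 0 receives weights that lie between those of its two neighbors, and the size of the CPPN itself does not change as agents are added.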

4  Surveillance  Experiment 

In this domain, two planes (P-3 AIPs) must patrol the waters of
Central America over a 72-hour period with the goal of spotting
boats that are smuggling narcotics. The area that must be
searched is vast and separated by land, so the planes must
cooperate to cover it effectively. The planes benefit, however,
from some information about the smugglers, although much of it
is uncertain. HyperNEAT is a natural fit for this problem
because of its large input space and cooperative nature. There
are two agent types to be considered: smugglers and searchers.

Planning asset allocation depends largely on the intelligence
gathered. In this scenario, the important information known
about the smuggler boats is the path they will take, the time
they will depart, and the speed at which they will travel. The
paths all begin in South America, lead to a point in Central
America, and are defined by one to four waypoints that the
boats will travel through. It is assumed that the boats will
take the most direct route to each waypoint. Both the waypoints
and the departure time have a known uncertainty value. These
uncertainties are used to generate probability distribution
heatmaps that represent the chance that a smuggler boat will be
at a given location at a given time. Figure 5 shows an example
heatmap with multiple possible boat tracks. In this experiment,
72 such distributions (one for each hour of the scenario) are
created at a resolution of 1 degree by 1 degree and serve as
the basis for deciding the patrol routes of the searchers.
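
One way such hourly heatmaps could be produced is by sampling many possible tracks and counting cell occupancy. The sketch below is only an illustration under stated assumptions: the source does not specify the noise model or geometry, so Gaussian perturbations, flat geometry measured in degrees, a speed in degrees per hour, and all function and parameter names are hypothetical. The waypoint list is taken to include the start and destination.

```python
import math
import random
from collections import Counter

HOURS = 72  # one distribution per hour of the scenario


def sample_track(waypoints, wp_sigma, depart, depart_sigma, speed, rng):
    """Sample one possible track: the grid cell per hour, or None if not at sea."""
    # Perturb the reported waypoints and departure time. Gaussian noise and
    # flat geometry in degrees are simplifying assumptions.
    pts = [(rng.gauss(x, wp_sigma), rng.gauss(y, wp_sigma)) for x, y in waypoints]
    start = max(0.0, rng.gauss(depart, depart_sigma))
    track = []
    for hour in range(HOURS):
        remaining = (hour - start) * speed  # degrees travelled since departure
        cell = None
        if remaining >= 0:
            # Walk the legs in order, taking the direct route between waypoints.
            for (ax, ay), (bx, by) in zip(pts, pts[1:]):
                leg = math.hypot(bx - ax, by - ay)
                if leg > 0 and remaining <= leg:
                    t = remaining / leg
                    cell = (math.floor(ax + t * (bx - ax)),
                            math.floor(ay + t * (by - ay)))
                    break
                remaining -= leg
        track.append(cell)
    return track


def build_heatmaps(boats, n_samples=200, seed=0):
    """Per-hour maps from 1-degree cell to the expected number of boats there."""
    rng = random.Random(seed)
    counts = [Counter() for _ in range(HOURS)]
    for boat in boats:
        for _ in range(n_samples):
            for hour, cell in enumerate(sample_track(rng=rng, **boat)):
                if cell is not None:
                    counts[hour][cell] += 1
    return [{cell: n / n_samples for cell, n in hourly.items()} for hourly in counts]


boat = {"waypoints": [(80.5, 5.5), (85.5, 9.5), (88.5, 13.5)],
        "wp_sigma": 0.5, "depart": 6.0, "depart_sigma": 2.0, "speed": 0.25}
heatmaps = build_heatmaps([boat])
```

Larger uncertainty values spread the probability mass over more cells, which is what makes the resulting patrol problem non-trivial.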

The searcher planes both start at the same location (89,13) and
can move up to 325 nautical miles every hour. To determine the
patrol route the planes will take, a substrate is defined that
takes the probability heatmaps defined above as input at z = 0.
This input sheet is fully connected to two output sheets of
neurons at z = 1 and z = -1 that are the same size as the
heatmaps, each of which represents a desired location to move
to (Figure 6). This substrate construction is slightly
different from the traditional multiagent HyperNEAT approach in
Section 3.2, and would likely require an additional input
dimension for more than two planes. The output neuron with the
highest activation within a plane’s movement radius determines
where the plane will move in the next hour. This process is
repeated for each timestep to produce the patrol route for both
planes. A smuggler boat is considered spotted if it and a plane
occupy the same cell at the same hour. Importantly, this
approach does not assume communication between the planes; that
is, the policy of one plane does not depend on the actions of
another plane, because both are defined by the substrate. Thus,
environments with limited, intermittent, or even non-existent
communication could benefit from a multiagent HyperNEAT
approach, because it allows the agents to cooperate effectively
without explicit communication.
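
The route-generation rule described above can be sketched as follows. This is a hedged illustration: `activate` is a hypothetical placeholder for the evolved network (it maps one heatmap to two output maps), and the movement radius assumes that one grid cell spans roughly 60 nautical miles, which ignores the narrowing of longitude with latitude.

```python
import math

MOVE_RADIUS_CELLS = 325 / 60  # about 5.4 cells per hour, assuming 1 degree = 60 nm


def next_cell(output_map, current, radius=MOVE_RADIUS_CELLS):
    """Return the reachable cell with the highest output activation."""
    cx, cy = current
    reachable = {
        cell: value
        for cell, value in output_map.items()
        if math.hypot(cell[0] - cx, cell[1] - cy) <= radius
    }
    return max(reachable, key=reachable.get) if reachable else current


def plan_routes(activate, heatmaps, start):
    """Build patrol routes for two planes, one output map per plane."""
    positions = [start, start]
    routes = [[start], [start]]
    for heatmap in heatmaps:              # one heatmap per hour
        outputs = activate(heatmap)       # two output maps from the network
        for i, output_map in enumerate(outputs):
            positions[i] = next_cell(output_map, positions[i])
            routes[i].append(positions[i])
    return routes


def spotted(route, boat_track):
    """A boat is spotted if it shares a cell with the plane in the same hour."""
    # route[0] is the starting position, so route[h + 1] is the cell at hour h.
    return any(p == b for p, b in zip(route[1:], boat_track) if b is not None)
```

Note that neither plane's choice depends on the other's position; any coordination has to come from the two output maps themselves.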

To evaluate the patrols, 25 Monte Carlo simulations are run and
each patrol is assigned a fitness based on the number of boats
spotted and how close the planes were to spotting the unspotted
boats. Thus, $1/B$, where $B$ is the number of boats, is added
to the fitness for each spotted boat, and
$0.1 \times (1 - d_b/d_{max}) \times 1/B$ is added for each
unspotted boat, where $d_b$ is the closest a plane got to the
given boat and $d_{max}$ is the maximum distance a boat could
be from a plane. This approach allows for a smoother search
gradient by partially rewarding strategies that come close to
discovering smugglers. Note that the input heatmaps are
pre-generated and not affected by the actual behavior of the
boats during the simulation. Thus the planes must come up with
patrol routes that will work in all situations.

Figure 6: Neural Network Substrate


5 Results

The results show that, on average, strategies that spot 90% of
boats are found within 149 generations, although the best run
found such solutions in only 43 generations (Figure 7). When
compared to the hand-coded strategy of moving towards the area
of highest probability while minimizing coverage overlap,
evolved solutions are, on average, significantly better within
26 generations (p < 0.05 by Student’s t-test). To test
generalization, the best strategies from each generation were
further evaluated on 250 Monte Carlo simulations that had not
been seen in training, and the average performance was not
significantly different from training performance at the
p < 0.05 level.

Figure 7: Training Results


6 Discussion and Future Work

The results show that multiagent HyperNEAT can find effective
patrol routes for this domain. Qualitatively, different
evolutionary runs produced several different types of
solutions. The two planes would always initially split up, but
from there several tactics were employed. Some planes found
high-traffic locations and stayed in those general areas,
whereas other planes moved around the entire map frequently.
Imposing additional goals or costs (e.g. fuel expenditure,
coverage, etc.) could shift the ideal policy closer to one of
these extremes, but the fact that there are multiple effective
solutions in this general case demonstrates the utility of
evolutionary approaches in finding “creative” solutions to
problems.

There are several future research directions based on this
work. First, the C2 problem could be extended to include
additional agent types such as friendly boats that will
actually intercept detected smugglers. An alternative extension
would be to increase the accuracy of the solutions by inputting
and outputting higher resolution heatmaps and scaling up the
substrate, as has been done in many other HyperNEAT problems
[24]. However, the ANNs in this paper were already quite large
(over 1.5 million connections), so training larger networks
could be time consuming. Finally, applying other C2 machine
learning approaches and comparing them to the HyperNEAT results
would provide interesting benchmarking opportunities.


7  Conclusion 

This paper presented a relevant C2 domain in the form of
determining patrol routes for planes searching for drug
smugglers and a machine learning approach to solving it. The
approach, multiagent HyperNEAT, was able to find multiple
effective routes that significantly outperformed hand-coded
heuristics. Ultimately this domain can serve as a new benchmark
for comparing C2 approaches in the future.

▼ Defining tactics for C2 is a complex task

▼ Increased available information makes it even harder

▼ A major problem is allocating assets for surveillance or
defense

Applying Machine Learning

▼ Artificial intelligence (AI) and Machine Learning (ML) can
mitigate this difficulty

▼ However, it can be difficult to assess their applicability and
effectiveness

▼ This presentation demonstrates an ML technique for asset
allocation and proposes a domain to evaluate such
approaches

Outline

▼ Background: Evolutionary Computation

▼ Background: HyperNEAT

▼ Approach: Multiagent HyperNEAT

▼ Patrol Experiment

▼ Results

▼ Discussion and Conclusion

Evolutionary Algorithms

[Figure: evolutionary loop with the steps Create Population,
Evaluate Population, Select Parents]

Example: Survival of the Roundest

[Figure: Gen 3, with individuals selected as parents]
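
As a rough illustration of the loop on the Evolutionary Algorithms slide, the sketch below evolves ellipses toward roundness. The ellipse representation, the roundness measure, Gaussian mutation, and keeping the parents in the next generation are all assumptions made for this example; the slides do not specify them.

```python
import random


def roundness(shape):
    """Fitness of an ellipse with axes (a, b): 1.0 for a circle, lower if stretched."""
    a, b = shape
    return min(a, b) / max(a, b)


def evolve(pop_size=20, generations=30, n_parents=5, sigma=0.1, seed=0):
    """Run a simple evolutionary loop and return the best fitness per generation."""
    rng = random.Random(seed)
    # Create population
    population = [(rng.uniform(0.1, 2.0), rng.uniform(0.1, 2.0))
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Evaluate population and select the roundest individuals as parents
        ranked = sorted(population, key=roundness, reverse=True)
        parents = ranked[:n_parents]
        best_per_generation.append(roundness(parents[0]))
        # Offspring are mutated copies; parents are kept (an assumed choice)
        children = []
        while len(children) < pop_size - n_parents:
            a, b = rng.choice(parents)
            children.append((max(0.05, a + rng.gauss(0, sigma)),
                             max(0.05, b + rng.gauss(0, sigma))))
        population = parents + children
    return best_per_generation


history = evolve()
```

Because the parents survive into the next generation, the best roundness in `history` never decreases from one generation to the next.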

Generative and Developmental Systems (GDS)

▼ Virtual DNA

▼ Motivated by biological
development

▼ Exploit patterns and reuse
information

▼ Describe a solution
through a mapping

[Figure: Gene Expression in a Fruit Fly Embryo (Meinhardt, 88)]

Compositional Pattern Producing Networks (CPPNs)

[Figure: a CPPN applied at each point, giving the value at x,y]

▼ Introduced by Stanley (2007)

▼ Composes functions that
represent events in
development

■ An abstraction of development

▼ CPPN takes a coordinate as
input

▼ Outputs a weight for that
coordinate

▼ Applying at all points creates a
pattern in space

■ In this case a 2D image

▼ Sampling possible at any
resolution or dimensionality
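
The last two bullets can be illustrated with a short sketch that samples one function at two resolutions. The particular composition in `toy_cppn` (a Gaussian for symmetry and a sine for repetition) is an assumed, hand-written example, not an evolved network.

```python
import math


def toy_cppn(x, y):
    """Hypothetical CPPN: a composition of simple functions of position."""
    symmetric = math.exp(-4.0 * x * x)        # Gaussian gives left-right symmetry
    repeating = math.sin(3.0 * math.pi * y)   # sine gives repetition along y
    return math.tanh(symmetric + 0.5 * repeating)


def render(cppn, resolution):
    """Sample the CPPN on a resolution-by-resolution grid over [-1, 1] x [-1, 1]."""
    step = 2.0 / (resolution - 1)
    coords = [-1.0 + i * step for i in range(resolution)]
    return [[cppn(x, y) for x in coords] for y in coords]


coarse = render(toy_cppn, 5)  # 5 x 5 image
fine = render(toy_cppn, 9)    # 9 x 9 image of the same underlying pattern
```

The same function yields both images, so the pattern is defined independently of the grid on which it is sampled.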

HyperNEAT

▼ Hypercube-based NeuroEvolution of Augmenting
Topologies (Stanley, D’Ambrosio, and Gauci 2009)

■ Co-invented by myself, Gauci, and Stanley

▼ An abstraction of embryo development

▼ Combines Compositional Pattern Producing Networks
(CPPNs) and NEAT (Neuroevolution of Augmenting
Topologies)

▼ Uses geometric information to create a neural network

▼ ANNs are made up of weighted connections

▼ A connection can be defined by its end points

■ {x1,y1},{x2,y2}

▼ A four-dimensional CPPN gives us:

■ CPPN(x1,y1,x2,y2) = 4D pattern or 2D connection pattern

[Figure: substrate; 3) Output is weight between (x1,y1) and (x2,y2)]

Substrates

▼ Substrate is a geometric
arrangement of neurons
for an ANN

▼ The neurons are arranged
to exploit the geometry of
the problem

■ e.g. Left sensor related to
left effector

▼ Can be any size, shape,
or dimensionality

[Figures: Checkers by Gauci; Robocup by Verbancsics]

Teams and Policy Geometry

▼ Teams have a geometry

▼ Introduce the concept of
policy geometry, that is,
how policies are
distributed among the
team

▼ Teammates share a
number of skills

▼ Goal: Generate policies as
a function of geometry

Multiagent HyperNEAT

▼ Extends HyperNEAT to
elegantly encode multiple
agents in a single genome

▼ Homogeneous team

■ A substrate representing a
single agent is used

■ {x1,y1,x2,y2} input to CPPN
generating connectivity pattern

■ The generated ANN is copied
to all agents on the team

■ Performance on the task is
tested

Heterogeneous Teams

▼ Add new dimension ‘z’

▼ Creates a stack of
networks

▼ Allows weights to be
computed as a result of
location within an
agent (x,y) and within the
team (z)

▼ Based on problem encountered by Joint Interagency Task
Force South (JIATF-S)

▼ Contraband transported from South to Central America

▼ Need to detect and interdict vessels with contraband

▼ Act on intelligence

▼ Problem is difficult to solve, but easy to evaluate

Data Set

Patrol Experiment

▼ Goal: Successfully detect
vessels with planes

■ 2 P-3 AIPs with visual range of
60nm

▼ Limited information on
contraband carrying vessels

■ With uncertainty

▼ Patrol the area and detect
as many boats as possible
in 72 hour period

▼ Input: Probability of vessel
being at location at a given
time

▼ Output: Where each plane
should go at current time

Snapshot at hour 57

Substrate Design

▼ Heatmap probability acts as input

▼ Divided into 1x1 degree grid cells

▼ Input layers connect to two output layers

▼ The highest activation on each layer is where
the plane will go next

▼ Compare to fixed policy of always moving
towards highest probability
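
For comparison, the fixed policy in the last bullet might look like the following sketch. It is an assumption-laden illustration: the source does not give the baseline's details, so letting each plane claim its target cell (to reduce overlap) and treating one grid cell as 60 nautical miles are choices made here, and `greedy_step` is a hypothetical name.

```python
import math


def greedy_step(heatmap, positions, radius=325 / 60):
    """One hour of a greedy baseline: each plane takes the best unclaimed cell."""
    claimed = set()
    moves = []
    for cx, cy in positions:
        # Cells this plane can reach that no earlier plane has already chosen
        options = {
            cell: p
            for cell, p in heatmap.items()
            if cell not in claimed
            and math.hypot(cell[0] - cx, cell[1] - cy) <= radius
        }
        target = max(options, key=options.get) if options else (cx, cy)
        claimed.add(target)
        moves.append(target)
    return moves


# Two planes at the shared start cell choose different targets.
print(greedy_step({(90, 13): 0.9, (88, 13): 0.5}, [(89, 13), (89, 13)]))
```

Unlike the evolved substrate, this rule reacts only to the current hour's heatmap and requires the planes to know each other's choices.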

Results

Discussion

▼ HyperNEAT quickly found effective patrol routes

▼ Found a variety of solutions:

■ Some found high traffic areas and stuck close

■ Some moved rapidly around the map

▼ May need to include additional costs (e.g. fuel)

▼ Include planning for interdiction (friendly ships)

▼ Substrate scaling for increased accuracy/faster training

▼ Comparison to other C2 approaches

■ e.g. human designed solutions

Conclusions

▼ Presented a relevant C2 domain

▼ Demonstrated a machine learning approach to solving the
domain

▼ Opens the door for future comparison and additional
benchmark tasks